Abstract

RockIt is a maximum a-posteriori (MAP) query en- 
gine for statistical relational models. MAP inference 
in graphical models is an optimization problem which 
can be compiled to integer linear programs (ILPs). We 
describe several advances in translating MAP queries 
to ILP instances and present the novel meta-algorithm 
cutting plane aggregation (CPA). CPA exploits local 
context-specific symmetries and bundles up sets of lin- 
ear constraints. The resulting counting constraints lead 
to more compact ILPs and make the symmetry of 
the ground model more explicit to state-of-the-art ILP 
solvers. Moreover, RockIt parallelizes most parts of
the MAP inference pipeline, taking advantage of ubiquitous
shared-memory multi-core architectures.
We report on extensive experiments with Markov logic
network (MLN) benchmarks showing that RockIt
outperforms the state-of-the-art systems Alchemy,
Markov TheBeast, and Tuffy both in terms of efficiency
and quality of results.



Introduction 

Maximum a-posteriori (MAP) queries in statistical rela- 
tional models ask for a most probable possible world given 
evidence. In this paper, we will present novel principles and 
algorithms for solving MAP queries in SRL models. MAP 
inference (or, alternatively, MPE inference) is an important 
type of probabilistic inference problem in graphical mod- 
els which is also used as a subroutine in numerous weight 
learning algorithms (Lowd and Domingos 2007). Being able
to answer MAP queries more efficiently often translates to 
improved learning performance. 

Since Markov logic (Richardson and Domingos 2006) is
arguably the most widely used statistical relational learn- 
ing (SRL) formalism, we use Markov logic as represen- 
tation formalism throughout this paper. However, the pro- 
posed approach is also applicable to numerous alternative 
SRL formalisms. Due to its expressiveness and declarative 
nature, numerous real-world problems have been modeled 
with Markov logic. Especially in the realm of data management
applications such as entity resolution (Singla and
Domingos 2006), data integration (Niepert, Meilicke, and
Stuckenschmidt 2010; Niepert 2010), ontology refinement
(Wu and Weld 2008), and information extraction (Poon and
Domingos 2007; Kok and Domingos 2008), Markov logic
facilitates rapid prototyping and achieves competitive
empirical results.

The main contributions of the presented work are as 
follows. First, we present a more compact compilation of 
MAP problems to integer linear programs (ILPs) where each 
ground clause is modeled by a single linear constraint. Sec- 
ondly, we introduce cutting plane aggregation (CPA). CPA 
exploits evidence-induced symmetries in the ground model 
that lead to context-specific exchangeability of the random 
variables. CPA not only results in more compact ILPs but
also makes the model's symmetries more explicit to the symmetry
detection heuristics of state-of-the-art ILP solvers.
Thirdly, we parallelize most parts of the MAP query pipeline 
so as to leverage multi-core architectures. Finally, we have 
designed and implemented the presented theory resulting 
in a novel and robust MLN engine RockIt that integrates 
cutting plane aggregation with cutting plane inference. Nu- 
merous experiments on established benchmarks show that 
CPA leads to significantly reduced run times. RockIt out- 
performs the state-of-the-art systems ALCHEMY, MARKOV 
TheBeast, and Tuffy both in terms of running time and 
quality of results on each of the MLN benchmarks. 

Related Work 

MaxWalkSAT (MWS), a random walk algorithm for solving
weighted SAT problems (Kautz, Selman, and Jiang
1997), is the standard inference algorithm for MAP queries
in the Markov logic engine Alchemy (Domingos et al.
2012). The system Tuffy (Niu et al. 2011) employs
relational database management systems to ground Markov
logic networks more efficiently. Tuffy also runs MWS on
the ground model which it initially attempts to partition into
disconnected components.

MAP queries in SRL models can be formulated as integer 
linear programs (ILPs). In this context, cutting plane inference
(CPI), which solves multiple smaller ILPs in several iterations,
has shown remarkable performance (Riedel 2008). In
each CPI iteration, only the ground formulas violated by the
current intermediate solution are added to the ILP formulation
until no violated ground formulas remain. Since CPI
ignores ground formulas satisfied by the evidence, it can be
seen as a generalization of pre-processing approaches that
count the formulas satisfied by the evidence (Shavlik and
Natarajan 2009). In the context of max-margin weight learning
for MLNs (Huynh and Mooney 2009) the MAP query
was formulated as a linear relaxation of an ILP and a rounding
procedure was applied to extract an approximate MAP
state. RockIt's ILP formulation requires fewer constraints
for ground clauses with negative weights and it combines
CPI with cutting plane aggregation.

There is a large class of symmetry-aware algorithms for 
SRL models. Examples of such lifted inference algorithms 
include first-order variable elimination (FOVE) (Poole
2003) and some of its extensions (Milch et al. 2008; Kisynski
and Poole 2009) making use of counting and aggregation
parfactors. FOVE has also been adapted to solve
MAP problems (FOVE-P) (de Salvo Braz, Amir, and Roth
2006). Apsel and Brafman (2012) introduced an approach
for MAP inference that takes advantage of uniform assignments,
which are groups of random variables that have identical
assignments in some MAP solution. Automorphism
groups of graphical models were used to lift variational
approximations of MAP inference (Bui, Huynh, and Riedel
2012). We attempted to compute automorphism groups as
an alternative method for aggregating constraints but
experiments showed that calling a graph automorphism algorithm
in each CPI iteration dominated the overall solving
time. Mladenov, Ahmadi, and Kersting (2012) computed
approximate solutions to linear programs by reducing the LP
problem to a pairwise MRF over Gaussians and applying
lifted Gaussian belief propagation. Similar to the approach
of Bui, Huynh, and Riedel (2012), lifted linear programming
can be used to approximate LP relaxations (Asano 2006) of
the MAP ILP. Contrary to previous work, RockIt uses a
more compact ILP formulation with a one-to-one correspon- 
dence between ground clauses and linear constraints, tightly 
integrates CPI and CPA, and estimates the optimal aggrega- 
tion scheme avoiding a costly exact computation in each CPI 
iteration. Moreover, contrary to lifted inference approaches 
operating solely on the first-order level, RockIt exploits 
evidence-induced local symmetries on the ground level. 

There are several lifted marginal inference approaches 
such as lifted message passing (Singla and Domingos 2008;
Kersting, Ahmadi, and Natarajan 2009), variants of lifted
knowledge compilation and theorem proving (Van den
Broeck 2011; Gogate and Domingos 2011), and lifted
MCMC (Niepert 2012; Venugopal and Gogate 2012)
approaches. While there are some generic parallel machine
learning architectures such as GraphLab (Low et al. 2010)
which could in principle be used for parallel MAP inference, 
RockIt is the first system that parallelizes MAP inference 
in SRL models combining CPI and CPA. 

Markov Logic 

Markov logic is a first-order template language combining 
first-order logic with log-linear graphical models. We first 
review function-free first-order logic (Genesereth and Nilsson
1987). Here, a term is either a constant or a variable. An
atom $p(t_1, \ldots, t_n)$ consists of a predicate $p/n$ of arity $n$
followed by $n$ terms $t_i$. A literal $\ell$ is an atom $a$ or its
negation $\neg a$. A clause is a disjunction $\ell_1 \vee \ldots \vee \ell_k$
of literals. The variables in clauses are always assumed to be
universally quantified. The Herbrand base $\mathcal{H}$ is the set of all
possible ground (instantiated) atoms. Every subset of the Herbrand
base is a Herbrand interpretation.

A Markov logic network $\mathcal{M}$ is a finite set of pairs
$(F_i, w_i)$, $1 \leq i \leq n$, where each $F_i$ is a clause in
function-free first-order logic and $w_i \in \mathbb{R}$. Together with a
finite set of constants $C = \{c_1, \ldots, c_n\}$ it defines the ground
Markov logic network $\mathcal{M}_C$ with one binary variable for each
grounding of predicates occurring in $\mathcal{M}$ and one feature for each
grounding of formulas in $\mathcal{M}$ with feature weight $w_i$. Hence,
a Markov logic network defines a log-linear probability
distribution over Herbrand interpretations (possible worlds)

$$P(x) = \frac{1}{Z} \exp\left( \sum_i w_i n_i(x) \right) \qquad (1)$$

where $n_i(x)$ is the number of satisfied groundings of clause
$F_i$ in possible world $x$ and $Z$ is a normalization constant.

In order to answer a MAP query given evidence $E = e$
one has to solve the maximization problem

$$\arg\max_x P(X = x \mid E = e)$$

where the maximization is performed over possible worlds
(Herbrand interpretations) $x$ compatible with the evidence.
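
Because $Z$ is a constant and the exponential function is monotone, the MAP query can be restated as maximizing the weighted count of satisfied groundings. The short derivation below is a sketch that uses only the symbols of equation (1) and assumes that the evidence $e$ has non-zero probability.

```latex
\begin{align*}
\arg\max_{x} P(X = x \mid E = e)
  &= \arg\max_{x \text{ compatible with } e} \frac{P(x)}{P(E = e)} \\
  &= \arg\max_{x \text{ compatible with } e} \frac{1}{Z} \exp\Big( \sum_i w_i n_i(x) \Big) \\
  &= \arg\max_{x \text{ compatible with } e} \sum_i w_i n_i(x)
\end{align*}
```

The last line is a linear function of the clause-satisfaction counts, which is why the problem lends itself to an ILP formulation.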

Cutting Plane Aggregation 

Each MAP query corresponds to an optimization problem 
with linear constraints and a linear objective function and, 
hence, we can formulate the problem as an instance of integer
linear programming. The novel cutting plane aggregation
approach is tightly integrated with cutting plane inference
(CPI), a meta-algorithm operating between the grounding
algorithm and the ILP solver (Riedel 2008). Instead of
immediately adding one constraint for each ground formula
to the ILP formulation, the ILP is initially formulated so as
to enforce the given evidence to hold in any solution. Based
on the solution of this more compact ILP one determines
the violated constraints, adds these to the ILP, and re-solves.
This process is repeated until no constraints are violated by
an intermediate solution.
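
The CPI loop described above can be sketched in a few lines of Python. This is a minimal illustration, not RockIt's implementation: an exhaustive search (the hypothetical `solve` function) stands in for the ILP solver, evidence is enforced by fixing atoms rather than by adding constraints, and only positive-weight and hard clauses (weight `None`) are handled. All names and the toy clauses are assumptions made for the example.

```python
from itertools import product

def satisfied(clause, world):
    """A clause is (weight, [(atom, is_positive), ...]); weight None means hard."""
    _, literals = clause
    return any(world[atom] == positive for atom, positive in literals)

def solve(active, atoms, evidence):
    """Toy stand-in for an ILP solver: exhaustive search over all worlds."""
    best_world, best_score = None, float("-inf")
    free = [a for a in atoms if a not in evidence]
    for values in product([False, True], repeat=len(free)):
        world = dict(evidence, **dict(zip(free, values)))
        # Hard clauses in the active set must hold.
        if any(c[0] is None and not satisfied(c, world) for c in active):
            continue
        score = sum(c[0] for c in active
                    if c[0] is not None and satisfied(c, world))
        if score > best_score:
            best_world, best_score = world, score
    return best_world

def cutting_plane_inference(clauses, atoms, evidence):
    """Add only the clauses violated by the current solution, then re-solve."""
    active = []
    while True:
        world = solve(active, atoms, evidence)
        new = [c for c in clauses
               if c not in active and not satisfied(c, world)]
        if not new:
            return world, len(active)
        active.extend(new)

# Hypothetical toy model with two constants a and b.
ATOMS = ["smokes_a", "cancer_a", "smokes_b", "cancer_b"]
EVIDENCE = {"smokes_a": True}
CLAUSES = [
    (1.5, [("smokes_a", False), ("cancer_a", True)]),   # !smokes(a) v cancer(a)
    (1.5, [("smokes_b", False), ("cancer_b", True)]),   # !smokes(b) v cancer(b)
    (None, [("cancer_b", False), ("smokes_b", True)]),  # hard: !cancer(b) v smokes(b)
]
world, n_active = cutting_plane_inference(CLAUSES, ATOMS, EVIDENCE)
```

In this toy run only one of the three ground clauses ever enters the active set, because the other two are already satisfied by the intermediate solutions.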

We begin by introducing a novel ILP formulation of MAP 
queries for Markov logic networks. In contrast to existing 
approaches (Riedel 2008; Huynh and Mooney 2009), the
formulation requires only one linear constraint per ground 
clause irrespective of the ground clause being weighted or 
unweighted. Moreover, we introduce the notion of context- 
specific exchangeability and describe the novel cutting plane 
aggregation (CPA) algorithm that exploits this type of local 
symmetry. Contrary to most symmetry-aware and lifted in- 
ference algorithms that assume no or only a limited amount 
of evidence, the presented approach specifically exploits 
model symmetries induced by the given evidence. 

General ILP Formulation 

In order to transform the MAP problem to an ILP we have to
first ground, that is, instantiate, the first-order theory specified
by the Markov logic network. Since we are employing
cutting plane inference, RockIt runs several join queries in a
relational database system in each iteration to retrieve
the ground clauses violated by the current solution. Hence,
in each iteration of the algorithm, RockIt maintains a set
of ground clauses $\mathcal{G}$ that have to be translated to an ILP
instance.

weight     ground clause                   max $1.1 z_1 - 0.5 z_2$ subject to
1.1        $x_1 \vee \neg x_2 \vee x_3$    $x_1 + (1 - x_2) + x_3 \geq z_1$
-0.5       $\neg x_1 \vee x_2$             $(1 - x_1) + x_2 \leq 2 \cdot z_2$
$\infty$   $\neg x_1 \vee x_2$             $(1 - x_1) + x_2 \geq 1$

Table 1: An example of the ILP formulation.

Given such a set of ground clauses $\mathcal{G}$, we associate one
binary ILP variable $x_\ell$ with each ground atom $\ell$ occurring in
some $g \in \mathcal{G}$. For the sake of simplicity, we will often denote
ground atoms and ILP variables with identical names. For a
ground clause $g \in \mathcal{G}$ let $L^+(g)$ be the set of ground atoms
occurring unnegated in $g$ and $L^-(g)$ be the set of ground
atoms occurring negated in $g$. Now, we encode the given
evidence by introducing linear constraints of the form $x_\ell \leq 0$
or $x_\ell \geq 1$ depending on whether the evidence sets the
corresponding ground atom $\ell$ to false or true. For every ground
clause $g \in \mathcal{G}$ with weight $w_g > 0$, $w_g \in \mathbb{R}$, we add a novel
binary variable $z_g$ and the following constraint to the ILP:

$$\sum_{\ell \in L^+(g)} x_\ell + \sum_{\ell \in L^-(g)} (1 - x_\ell) \geq z_g$$

Please note that if any of the ground atoms $\ell$ in the ground
clause is set to false (true) by the given evidence, we do not
include it in the linear constraint.

For every $g$ with weight $w_g < 0$, $w_g \in \mathbb{R}$, we add a novel
binary variable $z_g$ and the following constraint to the ILP:

$$\sum_{\ell \in L^+(g)} x_\ell + \sum_{\ell \in L^-(g)} (1 - x_\ell) \leq \left( |L^+(g)| + |L^-(g)| \right) z_g$$



For every $g$ with weight $w_g = \infty$, that is, a hard clause,
we add the following linear constraint to the ILP:

$$\sum_{\ell \in L^+(g)} x_\ell + \sum_{\ell \in L^-(g)} (1 - x_\ell) \geq 1$$



If a ground clause has zero weight we do not have to add 
the corresponding constraint. 

Finally, the objective of the ILP is:

$$\max \sum_{g \in \mathcal{G}} w_g z_g$$

where we sum over weighted ground clauses only, $w_g$ is the
weight of $g$, and $z_g \in \{0, 1\}$ is the binary variable previously
associated with ground clause $g$. We compute a MAP state
by solving the ILP whose solution corresponds one-to-one
to a MAP state $x$ where $x_i$ = true if the corresponding
ILP variable is 1 and $x_i$ = false otherwise.

For example, Table 1 depicts three clauses with $w > 0$,
$w < 0$, and $w = \infty$, and the respective ILP formulations.
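
To make the one-constraint-per-clause encoding concrete, the following Python sketch enumerates all assignments for the three clauses of Table 1 and, for each assignment, picks the value of $z_g$ that an optimal ILP solution would choose under the constraints above. It is an illustrative brute-force check under stated assumptions, not a call to an ILP solver; the function names (`lhs`, `best_z`, `map_by_ilp_encoding`) are hypothetical.

```python
from itertools import product

# A clause is (positive_atoms, negated_atoms), i.e. (L+(g), L-(g)).
WEIGHTED = [
    (1.1, (("x1", "x3"), ("x2",))),   # x1 v !x2 v x3
    (-0.5, (("x2",), ("x1",))),       # !x1 v x2
]
HARD = [(("x2",), ("x1",))]           # !x1 v x2 with infinite weight

def lhs(clause, x):
    """Left-hand side shared by all three constraint types."""
    pos, neg = clause
    return sum(x[a] for a in pos) + sum(1 - x[a] for a in neg)

def best_z(weight, clause, x):
    """Value of z_g chosen by an optimal solution for a fixed assignment x."""
    pos, neg = clause
    if weight > 0:
        feasible = [z for z in (0, 1) if lhs(clause, x) >= z]
    else:
        feasible = [z for z in (0, 1)
                    if lhs(clause, x) <= (len(pos) + len(neg)) * z]
    return max(feasible, key=lambda z: weight * z)

def map_by_ilp_encoding():
    """Enumerate assignments that satisfy the hard clauses; keep the best."""
    best = None
    for bits in product((0, 1), repeat=3):
        x = dict(zip(("x1", "x2", "x3"), bits))
        if any(lhs(c, x) < 1 for c in HARD):
            continue
        objective = sum(w * best_z(w, c, x) for w, c in WEIGHTED)
        if best is None or objective > best[0]:
            best = (objective, x)
    return best

objective, assignment = map_by_ilp_encoding()
```

For every assignment, `best_z` equals 1 exactly when the clause is satisfied, so the objective coincides with the weighted sum of satisfied ground clauses.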



Constraint Aggregation 

In this section we optimize the compilation of sets of 
weighted ground clauses to sets of linear constraints. More 
concretely, we introduce a novel approach that aggregates 
sets of ground clauses so as to make the resulting ILP have 
(a) fewer variables (b) fewer constraints and (c) its context- 
specific symmetries more exposed to the ILP solver's sym- 
metry detection heuristics. 

We first demonstrate that evidence often introduces sym- 
metries in the resulting sets of ground clauses and, there- 
fore, at the level of ILP constraints. The proposed approach 
aggregates ground clauses, resulting in smaller constraint 
matrices and aiding symmetry detection algorithms of the 
ILP solvers. The solvers apply heuristics to test whether 
the ILP's constraint matrix exhibits symmetries in form 
of permutations of its columns and rows. For a comprehensive
overview of existing principles and algorithms for
detecting and exploiting symmetries in integer linear programs
we refer the reader to (Margot 2010; Margot 2003;
Ostrowski et al. 2011; Bödi, Herr, and Joswig 2013). We
describe cutting plane aggregation in two steps. First, we
explain the aggregation of ground formulas and, second, we
describe the compilation of aggregated formulas to ILP con- 
straints. 

Definition 1 Let $G \subseteq \mathcal{G}$ be a set of $n$ weighted ground
clauses and let $c$ be a ground clause. We say that $G$ can be
aggregated with respect to $c$ if (a) all ground clauses in $G$
have the same weight and (b) for every $g_i \in G$, $1 \leq i \leq |G|$,
we have that $g_i = \ell_i \vee c$ where $\ell_i$ is a (unnegated or negated)
literal for each $i$, $1 \leq i \leq |G|$.

Example 1 Table 2 lists a set of 5 ground clauses. The set of
clauses $\{g_1, g_2, g_3\}$ can be aggregated with respect to
$\neg y_1 \vee y_2$ since we can write each of these ground clauses as
$\ell_i \vee \neg y_1 \vee y_2$ with $\ell_1 := x_1$, $\ell_2 := x_2$, and $\ell_3 := \neg x_3$.

Before we describe the advantages of determining ground 
clauses that can be aggregated and the corresponding ILP 
formulation encoding these sets of clauses, we provide a typ- 
ical instance of a Markov logic network resulting in a large 
number of clauses that can be aggregated. 

Example 2 Let us consider the clause $\neg\mathrm{smokes}(x) \vee \mathrm{cancer}(x)$
and let us assume that there are 100 constants
$C_1, \ldots, C_{100}$ for which we have evidence $\mathrm{smokes}(C_i)$,
$1 \leq i \leq 100$. For $1 \leq i \leq 100$, let $y_i$ be the ILP variable
corresponding to the ground atom $\mathrm{cancer}(C_i)$. The naive
formulation would contain 100 constraints $y_i \geq z_i$ and the
objective of the ILP with respect to these clauses would be
$\max 1.5 z_1 + \ldots + 1.5 z_{100}$. Instead, we can aggregate the
ground clauses $\mathrm{cancer}(C_i)$, $1 \leq i \leq 100$, for $c$ = false
and $\ell_i = \mathrm{cancer}(C_i)$, $1 \leq i \leq 100$.

Let $G \subseteq \mathcal{G}$ be a set of ground clauses with weight $w$ and
let $c$ be a ground clause. Moreover, let us assume that $G$ can
be aggregated with respect to $c$, that is, that each $g \in G$
can be written as $\ell_i \vee c$. The aggregated feature $f^G$ for the
clauses $G$ with weight $w$ maps each interpretation $I$ to an
integer value as follows

$$f^G(I) = \begin{cases} |G| & \text{if } I \models c \\ |\{\ell_i \vee c \in G : I \models \ell_i\}| & \text{otherwise} \end{cases}$$
g        $\ell_i$           $c$                     $w$
$g_1$    $x_1 \vee$         $\neg y_1 \vee y_2$     1.0
$g_2$    $x_2 \vee$         $\neg y_1 \vee y_2$     1.0
$g_3$    $\neg x_3 \vee$    $\neg y_1 \vee y_2$     1.0
$g_4$    $\neg x_4 \vee$    $\neg y_1 \vee y_3$     1.0
$g_5$    $x_5 \vee$         $\neg y_1$              0.5

After aggregation:

$\ell_i$                                      $c$                     $w$
$x_1 \vee$, $x_2 \vee$, $\neg x_3 \vee$       $\neg y_1 \vee y_2$     1.0
$\neg x_4 \vee$                               $\neg y_1 \vee y_3$     1.0
$x_5 \vee$                                    $\neg y_1$              0.5

Table 2: A set of ground clauses that can be aggregated.



$\ell_i$                                        $c$                     $w$      max $0.5 z_1 - 1.5 z_2$ subject to
$x_1 \vee$, $x_2 \vee$, $x_3 \vee$              $\neg y_1$              0.5      $x_1 + x_2 + x_3 + 3(1 - y_1) \geq z_1$
$\neg x_1 \vee$, $x_2 \vee$, $\neg x_3 \vee$    $y_1 \vee \neg y_2$     -1.5     $(1 - x_1) + x_2 + (1 - x_3) \leq z_2$
                                                                                 $3 \cdot y_1 \leq z_2$
                                                                                 $3 \cdot (1 - y_2) \leq z_2$

Table 3: Illustration of the constraint aggregation formulation.
For the sake of simplicity, we denote the ground atoms
and ILP variables with identical names.
[Figure 1 (flowchart): a stack S of first-order formulas feeds the stages
Preprocessing; Find violated constraints and perform CPA; Add aggregated
constraints to ILP; Parallel branch & bound; Return MAP state.]

Figure 1: RockIt parallelizes constraint finding, constraint
aggregation, and ILP solving.



The feature resulting from the aggregation, therefore,
counts the number of literals $\ell_i$ that are satisfied if the
ground clause $c$ is not satisfied and returns the number of
aggregated clauses otherwise. Please note that an encoding
of this feature in a factor graph would require space exponential
in the number of ground atoms even though the feature
only has a linear number of possible values. The feature,
therefore, is highly symmetric: each assignment to the
random variables corresponding to the unnegated (negated)
literals that has the same Hamming weight results in the
same feature weight contribution. This constitutes a
feature-specific local form of finite exchangeability (Finetti 1972;
Diaconis 1977) of random variables induced by the evidence.
Therefore, we denote this form of finite exchangeability
as context-specific exchangeability. Please note that
the concept is related to counting formulas used in some
lifted inference algorithms (Milch et al. 2008). While standard
models such as factor graphs cannot represent such
symmetric features compactly, one can encode these counting
features directly with a constant number of ILP constraints.
We now describe this translation in more detail.

As before, for any ground clause $c$, let $L^+(c)$ ($L^-(c)$) be
the set of ground atoms occurring unnegated (negated) in
$c$. We first show the formulation for clauses with positive
weights. Let $G \subseteq \mathcal{G}$ be a set of $n$ ground clauses with weight
$w > 0$ that can be aggregated with respect to $c$, that is, for
each $g \in G$ we have that $g = x_i \vee c$ or $g = \neg x_i \vee c$ for
some ground atom $x_i$ and a fixed clause $c$. We now add the
following two linear constraints to the ILP:

$$\sum_{(x_i \vee c) \in G} x_i + \sum_{(\neg x_i \vee c) \in G} (1 - x_i) + \sum_{\ell \in L^+(c)} n x_\ell + \sum_{\ell \in L^-(c)} n (1 - x_\ell) \geq z_g \qquad (2)$$

and

$$z_g \leq n \qquad (3)$$

Linear constraint (2) introduces the novel integer variable
$z_g$ for each aggregation. Whenever a solution satisfies the
ground clause $c$ this variable has the value $n$ and otherwise it
is equal to the number of literals $\ell_i$ satisfied by the solution.
Since constraint (2) alone might lead to values of $z_g$ that are
greater than $n$, the linear constraint (3) ensures that the value
of $z_g$ is at most $n$. However, linear constraint (3) only needs
to be added if clause $c$ is not the constant false.

We now describe the aggregation of clauses with negative
weight. Let $G \subseteq \mathcal{G}$ be a set of $n$ ground clauses with weight
$w < 0$ that can be aggregated with respect to $c$, that is, for
each $g \in G$ we have that $g = x_i \vee c$ or $g = \neg x_i \vee c$ for
a ground atom $x_i$ and a fixed clause $c$. We now add the
following linear constraints to the ILP:

$$\sum_{(x_i \vee c) \in G} x_i + \sum_{(\neg x_i \vee c) \in G} (1 - x_i) \leq z_g \qquad (4)$$

$$n x_\ell \leq z_g \quad \text{for every } \ell \in L^+(c) \qquad (5)$$

$$n (1 - x_\ell) \leq z_g \quad \text{for every } \ell \in L^-(c) \qquad (6)$$

Linear constraint (4) introduces an integer variable $z_g$ that
counts the number of ground clauses in $G$ that are satisfied.
For each of the integer variables $z_g$ representing an aggregated
set of clauses we add the term $w_g z_g$ to the objective
function where $w_g$ is the weight of each of the aggregated
clauses. Table 3 shows a set of aggregated ground clauses
and the corresponding ILP formulation. It is not difficult
to verify that each solution of the novel formulation corresponds
to a MAP state of the MLN it was constructed from.

Example 3 In Example 2 we aggregate the ground clauses
$\mathrm{cancer}(C_i)$, $1 \leq i \leq 100$, for $c$ = false and
$\ell_i = \mathrm{cancer}(C_i)$, $1 \leq i \leq 100$. Now, instead of 100 linear
constraints and 100 summands in the objective function,
the aggregated formulation only adds the linear constraint
$y_1 + \ldots + y_{100} \geq z_g$ and the term $1.5 z_g$ to the objective.
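
The claim that the aggregated variable $z_g$ takes the value of the counting feature can be checked by brute force on a small instance. The Python sketch below does this for the group $\{g_1, g_2, g_3\}$ of Table 2, for both the positive-weight constraints (2) and (3) and the negative-weight constraints (4) to (6). It assumes that an optimal solution pushes $z_g$ to its largest feasible value for positive weights and to its smallest feasible value for negative weights; all function names are hypothetical.

```python
from itertools import product

def lit(x, literal):
    """0/1 value of a literal (atom, is_positive) under assignment x."""
    atom, positive = literal
    return x[atom] if positive else 1 - x[atom]

def aggregated_z(x, literals, c, positive_weight):
    """Value of z_g at an optimum for a fixed assignment x."""
    n = len(literals)
    count = sum(lit(x, l) for l in literals)
    if positive_weight:
        # (2): z <= count + n * (satisfied literals of c); (3): z <= n.
        return min(n, count + n * sum(lit(x, l) for l in c))
    # (4): z >= count; (5), (6): z >= n * (each literal of c).
    return max([count] + [n * lit(x, l) for l in c])

def satisfied_clauses(x, literals, c):
    """Number of satisfied clauses l_i v c, i.e. the aggregated feature."""
    c_true = any(lit(x, m) for m in c)
    return sum(1 for l in literals if lit(x, l) or c_true)

def formulations_agree(literals, c):
    """Check z_g against the feature value for every assignment."""
    atoms = sorted({a for a, _ in literals + c})
    for bits in product((0, 1), repeat=len(atoms)):
        x = dict(zip(atoms, bits))
        expected = satisfied_clauses(x, literals, c)
        if aggregated_z(x, literals, c, True) != expected:
            return False
        if aggregated_z(x, literals, c, False) != expected:
            return False
    return True

# Group {g1, g2, g3} of Table 2, aggregated with respect to !y1 v y2.
LITERALS = [("x1", True), ("x2", True), ("x3", False)]
C = [("y1", False), ("y2", True)]
agree = formulations_agree(LITERALS, C)
```

Passing an empty list for `c` covers the case of Example 3, where $c$ is the constant false and the constraint reduces to a plain sum over the $y_i$.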
                     ER         IE         LP          PR         RC
predicates            ?          ?         22           ?          4
formulas              ?          ?         24           ?         17
evidence         12,892    258,079      1,031      12,999     99,161
clauses         390,720    340,737    354,587  40,234,321    202,215

Number of ILP constraints
w/o CPA         357,056      4,041     31,658   2,688,122    164,047
w/ CPA           10,782        932      6,617       1,573     10,064

Table 4: Characteristics of the MLN benchmark datasets and
the number of constraints of the respective ILP formulations.



We observed that the computation and aggregation of 
the violated constraints often dominated the ILP solving 
time. Since, in addition, state-of-the-art ILP solvers such as 
GUROBI already parallelize their branch and bound based al- 
gorithms we developed an additional method for paralleliz- 
ing the CPI and CPA phases. 

Parallelism and Implementation 

We will (a) briefly explain the parallelization framework
of RockIt, (b) outline the implementation of the aggregation
strategy, and (c) provide some details about the third-party
components we employed.

Figure 1 depicts the computational pipeline of the system.
After pre-processing the input MLN and loading it into
the relational database system, RockIt performs CPI it- 
erations until no new violated constraints are found. The 
violated constraints are computed with joins in the rela- 
tional database system where each table stores the predicate 
groundings of the intermediate solutions. In each CPI iter- 
ation, RockIt performs CPA on the violated constraints. 
We can parallelize the aggregation steps by processing each 
first-order formula in a separate thread. To this end, each 
first-order formula is initially placed on a stack S. RockIt 
creates one thread per available core and, when idle, makes 
each of the threads (i) pop a first-order formula from the 
stack S, (ii) compute the formula's violated groundings, 
and (iii) perform CPA on these groundings. The aggregated 
groundings are compiled into ILP constraints and added to 
the ILP formulation. When the stack S is empty and all 
threads idle we solve the current ILP in parallel, obtain a 
solution, and begin the next CPI iteration. 
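
The stack-based work distribution described above might look as follows in Python. This is a structural sketch only: `find_violated` and `aggregate` are hypothetical callbacks standing in for the database joins and the CPA step, and because of CPython's global interpreter lock the threads here illustrate the control flow rather than a real CPU-bound speed-up.

```python
import os
import queue
import threading

def process_formulas(formulas, find_violated, aggregate, num_threads=None):
    """Each worker pops a first-order formula from a shared stack, computes
    its violated groundings, aggregates them, and stores the result."""
    stack = queue.LifoQueue()
    for formula in formulas:
        stack.put(formula)

    constraints = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                formula = stack.get_nowait()
            except queue.Empty:
                return  # stack is empty: this thread becomes idle
            result = aggregate(find_violated(formula))
            with lock:
                constraints.extend(result)

    count = num_threads or os.cpu_count() or 1
    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # all threads idle: the ILP can now be solved
    return constraints

# Hypothetical usage with dummy callbacks.
result = process_formulas(
    ["F1", "F2", "F3", "F4", "F5"],
    find_violated=lambda f: [f + "_g1", f + "_g2"],
    aggregate=lambda groundings: [tuple(groundings)],
    num_threads=2,
)
```

The order of the returned constraints depends on thread scheduling, so callers should not rely on it.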

Figure 3: Running time of RockIt with CPA on varying
numbers of cores. Due to space considerations we only show
the results for the lowest gap $10^{-10}$ for the ER, IE, RC, and PR
benchmarks and the gap 0.01 for the LP benchmark.

There are different possible strategies for finding the
ground clauses $c$ (see Definition 1) that minimize the number
of counting constraints per first-order formula. This
problem can be solved optimally with algorithms
that detect symmetries in propositional formulas such as
Saucy (Darga, Sakallah, and Markov 2008) or, alternatively,
by reducing it to a frequent itemset mining problem
for which several algorithms exist. However, these algorithms
need to be called in each CPI iteration and for each first-order
formula, and extensive experiments showed that their running
time dominated ILP solving. Therefore, we implemented a
greedy algorithm that only estimates the optimal aggregation
scheme. The algorithm stores, for each first-order clause, the
violated groundings of the form $\ell_1 \vee \ldots \vee \ell_n$ in a table
with $n$ columns where each column represents one literal
position of the clause. For each column $k$, RockIt computes
the set of distinct rows $R_k$ of the table that results
from the projection onto the columns $\{1, \ldots, n\} \setminus \{k\}$. Let
$d = \arg\min_k \{|R_k|\}$. The clause groundings are then
aggregated with respect to the rows in $R_d$.
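
The greedy column-projection estimate can be sketched as follows. This Python version keeps the table in memory as a list of tuples instead of database tables, uses 0-based column indices, and represents literals as plain strings; these choices and the function name `greedy_aggregate` are assumptions made for illustration.

```python
from collections import defaultdict

def greedy_aggregate(rows):
    """rows: violated groundings of one first-order clause, each a tuple of
    n literals. Returns the chosen column d and a mapping from each distinct
    projected row (the clause c) to the literals l_i aggregated under it."""
    n = len(rows[0])

    def projection(row, k):
        # Drop column k, keep all other literal positions.
        return row[:k] + row[k + 1:]

    # Pick the column whose removal leaves the fewest distinct rows.
    d = min(range(n), key=lambda k: len({projection(r, k) for r in rows}))

    groups = defaultdict(list)
    for r in rows:
        groups[projection(r, d)].append(r[d])
    return d, dict(groups)

# The three-literal clauses g1..g4 of Table 2 ("!" marks negation).
ROWS = [
    ("x1", "!y1", "y2"),
    ("x2", "!y1", "y2"),
    ("!x3", "!y1", "y2"),
    ("!x4", "!y1", "y3"),
]
d, groups = greedy_aggregate(ROWS)
```

On this input the first column is selected, which yields two counting constraints instead of four per-clause constraints and reproduces the grouping shown in Table 2.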

RockIt employs MySQL's in-memory tables for the
computation of violated constraints and the aggregation.
Most tables are hash indexed to facilitate highly efficient
join processing. We use Gurobi as RockIt's internal ILP
solver due to its ability to parallelize its branch, bound, and
cut algorithm, its remarkable performance on standard ILP
benchmarks (Koch et al. 2011), and its symmetry detection
heuristics. In addition to the project code, RockIt is made
available as a web-service where users can upload MLNs.
Furthermore, programmers can integrate the MLN query engine
in their own applications via a REST interface.

Experiments 

With the following experiments we assess whether and to 
what extent RockIt (a) reduces the number of constraints 
in the ILP formulation, (b) reduces the overall runtime, and 
(c) outperforms state-of-the-art MLN systems. We compare 
RockIt to three MLN systems: Alchemy (Domingos et
al. 2012), Markov TheBeast (Riedel 2008), and Tuffy
(version 3) (Niu et al. 2011). To ensure a fair comparison, we
made Markov TheBeast also use the ILP solver Gurobi.
In addition, we investigate whether and to what extent (d) 
RockIt's performance increases with the number of avail- 
able cores of a shared-memory architecture. All experiments 
were performed on a standard PC with 8 GB RAM and 2 
cores with 2.4 GHz each unless otherwise stated. 

We used several established benchmark MLNs for the
empirical evaluation. The entity resolution (ER) MLN addresses
the problem of finding records corresponding to
the same real-world entity (Singla and Domingos 2006).
The information extraction (IE) MLN (Poon and Domingos 2007)
was created for the extraction of database records
from text or semi-structured sources. The link prediction
MLN (LP) was built to predict the relations holding between
faculty, staff, and students of several university departments
(Richardson and Domingos 2006). The protein
interaction (PR) MLN was designed to predict interactions
between proteins, enzymes, and phenotypes. The relational
classification (RC) MLN performs a classification on the
CORA (McCallum et al. 2000) dataset. In (Niu et al. 2011)
the MLN was used to compare the performance of Tuffy to
the Alchemy system. The ER, LP, IE, and RC MLNs were
downloaded from the Tuffy website and the PR MLN from
the Alchemy website. The formula weights were learned
with Alchemy. Table 4 summarizes the properties of the
five MLN benchmarks.

http://code.google.com/p/rockit/

Figure 2: Running time in seconds of RockIt, Alchemy, Markov TheBeast, and Tuffy for different gaps (bounds on
the relative error) and with two cores. * gap not reached within 1 hour, ** out of memory, *** did not terminate within 1 hour.

Applying CPI alone, without CPA, already led to significantly
more compact ILPs. For the IE benchmark, for instance,
the number of constraints was reduced from 342,279
to 4,041. When both CPI and CPA were used, the number
of ILP constraints was further reduced for each of the
MLNs. The largest reduction in the number of constraints
was achieved for the PR MLN. Here, the number of constraints
was reduced from 2,688,122 to 1,573. Table 4 lists
the number of ground clauses and the number of ILP constraints
with CPI and with/without CPA.

In order to compare the performance of the MLN systems,
we measured the time needed to compute an interpretation
whose weight (the sum in equation (1)) has a relative
error of at most $\epsilon = 10^{-r}$, $r \in \{1, 2, 3, 10\}$, with respect
to the optimal weight. To this end, we used RockIt
to compute an ILP solution whose objective value has a relative
error of at most $10^{-10}$ and computed the actual weight
$\omega_{10}$ of the interpretation corresponding to this ILP solution.
From this value we computed $\omega_r$ for $r \in \{1, 2, 3\}$ by
multiplying $\omega_{10}$ with $1 - 10^{-r}$. The MLN systems were
run, for each $r \in \{1, 2, 3, 10\}$, with an increasing number
of MaxWalkSAT flips for Alchemy and Tuffy or
with decreasing values of Gurobi's MIPGap parameter for
Markov TheBeast and RockIt until a parameter configuration
achieved an interpretation weight of at least $\omega_r$,
or until one hour had passed, whichever came first. Figure 2
summarizes the results for the four different gaps.
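
The target weights used as stopping thresholds follow directly from the reference weight. The Python snippet below is a small arithmetic sketch of that step; the function name `target_weights` and the reference weight of 1000.0 are hypothetical and only serve to illustrate the computation (a positive reference weight is assumed).

```python
def target_weights(omega_10, exponents=(1, 2, 3)):
    """Thresholds omega_r = omega_10 * (1 - 10^-r) for each gap exponent r."""
    return {r: omega_10 * (1 - 10.0 ** -r) for r in exponents}

# Hypothetical reference weight, not taken from the experiments.
targets = target_weights(1000.0)
```

A system is considered to have reached gap $10^{-r}$ once the weight of its interpretation is at least the corresponding threshold.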

Using CPA was always more efficient except for the IE
benchmark where the average running time remained almost
identical. The largest decrease in running time due to CPA,
from 84 to 13 seconds, was observed for the PR dataset.
RockIt was more efficient and was often able to compute
a higher objective than Tuffy, Markov TheBeast, and
Alchemy. In all cases, RockIt was able to compute an
interpretation with the highest weight in less time. We conjecture
that RockIt without CPA is more efficient than Markov
TheBeast because of RockIt's more compact ILP formulation
and the parallelization of the CPI phase. For the ER,
PR, and RC datasets Tuffy and Alchemy were not able to
achieve the same approximation as RockIt. Alchemy did
not finish grounding within one hour on the RC dataset and
ran out of memory on the PR dataset. For the LP dataset, no
system was able to achieve a gap of 0.001 or lower.

Figure 3 compares the runtime of RockIt with CPA for
different numbers of cores. For each benchmark, the runtime
decreases when the number of cores increases with a dimin- 
ishing reduction in runtime. The LP benchmark has the high- 
est relative decrease of about 53% when comparing the run- 
ning times on 1 and 8 cores. 

Conclusions 

We presented RockIt, a system for parallel MAP inference
in SRL models combining CPI and cutting plane aggregation
(CPA). CPA is a novel algorithm that aggregates
symmetric constraints. Extensive experiments showed that
RockIt is more efficient than existing MLN systems. Future
work will investigate more aggregation strategies and
the combination of CPA with lifted inference approaches.